Skip to content

feat(daemonset): add restart and rollout-aware wait methods#2753

Open
rnetser wants to merge 4 commits into
mainfrom
feature/issue-2752-daemonset-restart-rollout
Open

feat(daemonset): add restart and rollout-aware wait methods#2753
rnetser wants to merge 4 commits into
mainfrom
feature/issue-2752-daemonset-restart-rollout

Conversation

@rnetser

@rnetser rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

What

Add two new methods to the DaemonSet class:

  • restart() — Patches the pod template with a kubectl.kubernetes.io/restartedAt annotation, triggering a rolling restart (equivalent to oc rollout restart).
  • wait_for_rollout() — Waits until the DaemonSet rollout is complete by checking:
    • observedGeneration == metadata.generation (controller has seen latest spec)
    • updatedNumberScheduled == desiredNumberScheduled (all pods updated)
    • numberAvailable == desiredNumberScheduled (all pods available, respecting minReadySeconds)

Why

wait_until_deployed only checks desiredNumberScheduled == numberReady, which is racy after a rollout restart — old pods are still ready so it returns immediately before any pod is replaced. This forced consumers to shell out to oc rollout restart + oc rollout status.

Tests

5 unit tests covering:

  • Restart annotation patching
  • Rollout complete (happy path)
  • Partial update wait (updatedNumberScheduled < desired)
  • Generation mismatch wait (observedGeneration != generation)
  • Available vs ready distinction (numberAvailable < desired while numberReady == desired)

Closes #2752

Summary by CodeRabbit

  • New Features

    • Added DaemonSet restart support, allowing workloads to be restarted by updating the pod template annotation.
    • Added rollout monitoring so DaemonSet updates can be checked until they are fully available and up to date.
  • Tests

    • Added coverage for restart behavior.
    • Added coverage for rollout completion and waiting behavior across several status conditions.

Add restart() method that patches the pod template with a
kubectl.kubernetes.io/restartedAt annotation, triggering a rolling
restart (equivalent to `oc rollout restart`).

Add wait_for_rollout() method that checks observedGeneration,
updatedNumberScheduled, and numberAvailable to reliably wait for
a DaemonSet rollout to complete (unlike wait_until_deployed which
only checks numberReady and is racy after restarts).

Closes #2752

Co-authored-by: pi-coding-agent <noreply@pi-coding-agent.com>
Signed-off-by: rnetser <rnetser@redhat.com>
@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@rnetser, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 37 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a2a4daea-7647-40d8-b8a3-38a177331703

📥 Commits

Reviewing files that changed from the base of the PR and between cd50525 and 0be7b37.

📒 Files selected for processing (2)
  • ocp_resources/daemonset.py
  • tests/test_daemonset.py

Walkthrough

Adds restart() and wait_for_rollout() methods to the DaemonSet class in ocp_resources/daemonset.py, enabling annotation-based pod template restarts and polling for rollout completion based on generation, scheduling, and availability status. Adds a corresponding test module covering both new methods.

Changes

DaemonSet restart and rollout wait

Layer / File(s) Summary
Restart and rollout wait implementation
ocp_resources/daemonset.py
Adds restart(), which patches the pod template with a kubectl.kubernetes.io/restartedAt annotation, and wait_for_rollout(timeout=TIMEOUT_4MINUTES), which polls via TimeoutSampler until observed generation, updated scheduling, and available pod counts match the desired scheduled count.
Unit tests for restart and wait_for_rollout
tests/test_daemonset.py
Adds a daemonset fixture, a mocked API response helper, and tests verifying restart() triggers a single patched update() call and wait_for_rollout() waits or returns correctly based on generation, scheduling, and availability conditions.

Estimated code review effort: 2 (Simple) | ~12 minutes

Related issues: #2752 feat(daemonset): add restart and rollout-aware wait methods

Suggested labels: enhancement, tests

Suggested reviewers: RedHatQE/openshift-python-wrapper maintainers

🐰 A daemon set stirs beneath the hood,
With restart stamps and status understood,
Rolling out its pods anew,
Waiting till the counts ring true,
Then hops away, its rollout good.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title is concise and accurately summarizes the new DaemonSet restart and rollout-wait behavior.
Description check ✅ Passed The description covers the feature, rationale, tests, and linked issue, though it doesn't use the repo's exact template headings.
Linked Issues check ✅ Passed The changes add the requested restart and rollout-aware wait methods plus tests, matching the linked issue's objectives.
Out of Scope Changes check ✅ Passed Only DaemonSet code and its tests were changed, and both are directly related to the requested feature.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/issue-2752-daemonset-restart-rollout

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@rh-bot-1

rh-bot-1 commented Jul 1, 2026

Copy link
Copy Markdown

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message
  • /security-override - Set security check runs to pass (maintainers only)
  • /security-override cancel - Re-run security checks

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest python-module-install - Test Python package installation
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3
  • /cherry-pick-retry <branch> - Retry a failed cherry-pick (merged PRs only)

Branch Management

  • /rebase - Rebase this PR branch onto its base branch

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. Status Checks: All required status checks must pass
  3. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  4. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • rnetser

Reviewers:

  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge
AI Features
  • Conventional Title: Mode: fix (claude/claude-opus-4-6-1m)
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6-1m)
Security Checks
  • Suspicious Path Detection: Monitors paths: .claude/, .vscode/, .cursor/, .devcontainer/, .pi/, .github/workflows/, .github/actions/
  • Committer Identity Check: Verifies last committer matches PR author
  • Mandatory: Security checks block merge (use /security-override to bypass — maintainers only)

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Add DaemonSet restart() and rollout-aware wait_for_rollout()

✨ Enhancement 🧪 Tests 🕐 20-40 Minutes

Grey Divider

AI Description

• Add DaemonSet.restart() to trigger rolling restarts via restartedAt template annotation.
• Add DaemonSet.wait_for_rollout() to wait for observed generation, updated, and available pods.
• Add unit tests covering restart patching and rollout completion/wait scenarios.
Diagram

graph TD
  Caller["Consumer code"] --> DaemonSet["DaemonSet class"] --> K8sAPI{{"Kubernetes API"}}
  DaemonSet --> Sampler["TimeoutSampler"] --> K8sAPI
  K8sAPI --> Status["DaemonSet status"]
  Tests[/"Unit tests"/] --> DaemonSet
  Tests --> Sampler

  subgraph Legend
    direction LR
    _mod["Module"] ~~~ _ext{{"External"}} ~~~ _test[/"Tests"/]
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Use DaemonSet status.conditions instead of counters
  • ➕ Could align more closely with Kubernetes-native rollout signaling
  • ➕ Potentially fewer numeric edge cases across Kubernetes versions
  • ➖ Condition semantics can vary and are less deterministic for 'all pods updated' vs 'available' requirements
  • ➖ May still require additional numeric checks for correctness
2. Watch-based rollout waiting (K8s watch/informer)
  • ➕ Lower API polling overhead and faster reaction to changes
  • ➕ More scalable for long rollouts
  • ➖ More implementation complexity and error handling (reconnects, resourceVersion, timeouts)
  • ➖ Harder to unit test than deterministic polling
3. Shell out to kubectl/oc rollout status
  • ➕ Delegates rollout logic to mature CLI behavior
  • ➕ Minimal code in library
  • ➖ Adds runtime dependency on external binaries and cluster-specific tooling
  • ➖ Harder to use in restricted environments and less portable in Python APIs

Recommendation: Current approach (restart via restartedAt annotation + polling on observedGeneration/updatedNumberScheduled/numberAvailable) is a pragmatic, portable solution that avoids CLI dependencies and fixes the known race in wait_until_deployed after restarts. If future issues arise across cluster versions, consider augmenting with condition checks, but keep numeric gates as the primary source of truth.

Files changed (2) +191 / -0

Enhancement (1) +56 / -0
daemonset.pyAdd DaemonSet restart() and rollout-aware wait_for_rollout() +56/-0

Add DaemonSet restart() and rollout-aware wait_for_rollout()

• Introduces restart() to trigger a rolling restart by patching the pod template with a kubectl.kubernetes.io/restartedAt timestamp. Adds wait_for_rollout() that polls the DaemonSet until the controller observed the latest generation and all desired pods are updated and available, avoiding the ready-count race in wait_until_deployed.

ocp_resources/daemonset.py

Tests (1) +135 / -0
test_daemonset.pyAdd unit tests for DaemonSet restart and rollout waiting +135/-0

Add unit tests for DaemonSet restart and rollout waiting

• Adds pytest-based unit tests validating restart() patch structure and wait_for_rollout() behavior across completion and several waiting scenarios by mocking TimeoutSampler iteration results.

tests/test_daemonset.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ocp_resources/daemonset.py (1)

28-42: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Duplicate TimeoutSampler construction between wait_until_deployed and wait_for_rollout.

Both methods build an identical TimeoutSampler (same wait_timeout, sleep, exceptions_dict, func, field_selector, namespace) and only differ in the loop condition. Consider extracting a small private helper (e.g. _daemonset_samples(timeout)) to avoid duplication and keep the two wait strategies in sync going forward.

Also applies to: 63-97

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ocp_resources/daemonset.py` around lines 28 - 42, Both wait methods are
duplicating the same TimeoutSampler setup, so extract the shared construction
into a small private helper on DaemonSet (for example, a helper used by
wait_until_deployed and wait_for_rollout) and have both methods reuse it. Keep
the existing loop-specific readiness checks in each method, but move the common
sampler arguments and api.get call into the helper so the two wait strategies
stay consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ocp_resources/daemonset.py`:
- Around line 63-97: Update the docstring for DaemonSet.wait_for_rollout to
match the actual readiness check used by the implementation. The current wording
says the updated pods are “ready,” but the method returns based on
status.numberAvailable in the rollout condition, so revise the text to describe
availability rather than readiness and keep the explanation aligned with the
observedGeneration, updatedNumberScheduled, and numberAvailable checks.

---

Nitpick comments:
In `@ocp_resources/daemonset.py`:
- Around line 28-42: Both wait methods are duplicating the same TimeoutSampler
setup, so extract the shared construction into a small private helper on
DaemonSet (for example, a helper used by wait_until_deployed and
wait_for_rollout) and have both methods reuse it. Keep the existing
loop-specific readiness checks in each method, but move the common sampler
arguments and api.get call into the helper so the two wait strategies stay
consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 524c442e-078d-4619-9c84-af4d42c7516a

📥 Commits

Reviewing files that changed from the base of the PR and between 3d29c21 and cd50525.

📒 Files selected for processing (2)
  • ocp_resources/daemonset.py
  • tests/test_daemonset.py

Comment thread ocp_resources/daemonset.py
@qodo-code-review

qodo-code-review Bot commented Jul 1, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 📜 Skill insights (0)

Context used

Grey Divider


Action required

1. Status dereference crash ✓ Resolved 🐞 Bug ☼ Reliability
Description
wait_for_rollout() assumes item.status and its fields are always present; if status is missing/None
(common early in reconciliation), the method can raise AttributeError/TypeError and abort waiting
instead of continuing until rollout completes.
Code

ocp_resources/daemonset.py[R86-94]

+            if sample.items:
+                item = sample.items[0]
+                status = item.status
+                desired_number_scheduled = status.desiredNumberScheduled
+                if (
+                    desired_number_scheduled > 0
+                    and status.observedGeneration == item.metadata.generation
+                    and status.updatedNumberScheduled == desired_number_scheduled
+                    and status.numberAvailable == desired_number_scheduled
Relevance

⭐⭐⭐ High

Team accepted polling hardening for missing/None status to avoid AttributeError (PR #2499).

PR-#2499

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The new implementation dereferences item.status and status.desiredNumberScheduled without null
checks, which can throw if status is not populated. The repo previously fixed the same polling
anti-pattern by sampling self.exists and reading status via to_dict().get(...) to avoid
AttributeError when status is absent.

/ocp_resources/daemonset.py[85-96]
/ocp_resources/resource.py[1010-1025]
PR-#2499

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`DaemonSet.wait_for_rollout()` directly dereferences `item.status` and several status fields without guarding for missing/None values. This can raise runtime exceptions during transient states (e.g., before the controller has populated `status`), causing the wait to fail early.

### Issue Context
The repository already uses a defensive pattern for waiting on status by sampling `self.exists` and extracting `status` via `to_dict()` + `.get(...)` to avoid attribute errors.

### Fix Focus Areas
- /ocp_resources/daemonset.py[76-96]
- /ocp_resources/resource.py[1010-1025]

### Suggested fix
- In `wait_for_rollout`, treat missing `status` or missing numeric fields as non-ready and `continue` polling rather than dereferencing.
- Prefer extracting values from `item.to_dict()` (or `sample.to_dict()` if you change the sampler to `func=lambda: self.exists`) and defaulting absent numeric fields to `0`.
- Add/extend unit tests to cover a sample where `status` is None/missing and ensure the loop continues until a valid complete state is observed.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Zero-desired rollout timeout ✓ Resolved 🐞 Bug ≡ Correctness
Description
wait_for_rollout() requires desiredNumberScheduled > 0 to return, so a DaemonSet with
desiredNumberScheduled==0 (e.g., no schedulable/matching nodes) will always time out even though
rollout is effectively complete.
Code

ocp_resources/daemonset.py[R89-95]

+                desired_number_scheduled = status.desiredNumberScheduled
+                if (
+                    desired_number_scheduled > 0
+                    and status.observedGeneration == item.metadata.generation
+                    and status.updatedNumberScheduled == desired_number_scheduled
+                    and status.numberAvailable == desired_number_scheduled
+                ):
Relevance

⭐⭐ Medium

No clear precedent on desired==0 rollout completion; existing patterns often gate on desired>0.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The DaemonSet implementation explicitly blocks completion for desired==0, while an analogous wait
helper (Deployment.wait_for_replicas) returns successfully in zero-desired/zero-actual scenarios to
avoid permanent timeouts.

/ocp_resources/daemonset.py[89-96]
/ocp_resources/deployment.py[375-385]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`DaemonSet.wait_for_rollout()` hard-requires `desiredNumberScheduled > 0` before it can ever return. This makes the method hang until timeout for valid DaemonSets that currently have zero desired pods.

### Issue Context
Other rollout/wait helpers in this repo treat “desired == 0” as a valid terminal state and return successfully rather than timing out.

### Fix Focus Areas
- /ocp_resources/daemonset.py[89-96]
- /ocp_resources/deployment.py[375-385]

### Suggested fix
- Remove the `desired_number_scheduled > 0` guard, and instead allow completion when all compared values match even if they are 0.
- If you still want an explicit early-return case, treat `desiredNumberScheduled == 0` as complete once `observedGeneration == metadata.generation` (and optionally `updatedNumberScheduled == 0` and `numberAvailable == 0`).
- Add a unit test for `desired=0` to ensure `wait_for_rollout()` returns promptly rather than timing out.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Informational

3. Tests assert numberAvailable rollout ✗ Dismissed 📎 Requirement gap ⚙ Maintainability
Description
The new unit tests for wait_for_rollout() validate completion/waiting behavior based on
numberAvailable rather than the required numberReady condition. This means the test suite does
not cover the compliance-specified rollout-ready predicate and will not prevent regressions against
it.
Code

tests/test_daemonset.py[R104-113]

+    def test_wait_for_rollout_waits_when_not_all_available(self, daemonset):
+        not_available_response = _make_api_response(
+            generation=2,
+            observed_generation=2,
+            desired=3,
+            updated=3,
+            available=1,
+        )
+        not_available_response.items[0].status.numberReady = 3  # ready but not available yet
+
Relevance

⭐ Low

No precedent; tests align with PR’s numberAvailable intent, so likely kept as-is.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
PR Compliance ID 4 requires tests to verify that wait_for_rollout() checks the specified rollout
conditions (including numberReady == desiredNumberScheduled). The added test
test_wait_for_rollout_waits_when_not_all_available explicitly sets numberReady to the desired
value but still expects waiting, demonstrating the tests are keyed to numberAvailable instead of
the required numberReady condition.

Automated tests added for DaemonSet.restart() and DaemonSet.wait_for_rollout()
DaemonSet provides a rollout-aware wait_for_rollout() method using generation and updated/ready scheduling checks
tests/test_daemonset.py[104-113]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `wait_for_rollout()` tests are written around `status.numberAvailable` semantics (including a case where `numberReady == desired` but the test expects to keep waiting). Compliance requires `wait_for_rollout()` to use `numberReady`.

## Issue Context
After updating `wait_for_rollout()` to check `numberReady`, the test cases should assert waiting/completion based on `numberReady` rather than `numberAvailable`.

## Fix Focus Areas
- tests/test_daemonset.py[13-25]
- tests/test_daemonset.py[44-135]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. wait_for_rollout() checks numberAvailable ✗ Dismissed 📎 Requirement gap ≡ Correctness
Description
DaemonSet.wait_for_rollout() uses status.numberAvailable == desiredNumberScheduled instead of
the required status.numberReady == desiredNumberScheduled, so it does not meet the specified
rollout-ready criteria. This is a direct mismatch with the compliance success criteria for the new
rollout-aware method.
Code

ocp_resources/daemonset.py[R89-95]

+                desired_number_scheduled = status.desiredNumberScheduled
+                if (
+                    desired_number_scheduled > 0
+                    and status.observedGeneration == item.metadata.generation
+                    and status.updatedNumberScheduled == desired_number_scheduled
+                    and status.numberAvailable == desired_number_scheduled
+                ):
Relevance

⭐ Low

No precedent; PR explicitly intends numberAvailable (minReadySeconds), so changing to numberReady
likely rejected.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
PR Compliance ID 2 requires status.numberReady == status.desiredNumberScheduled as a success
condition. The added code instead checks status.numberAvailable == desiredNumberScheduled, so it
does not implement the required readiness predicate.

DaemonSet provides a rollout-aware wait_for_rollout() method using generation and updated/ready scheduling checks
ocp_resources/daemonset.py[89-95]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`DaemonSet.wait_for_rollout()` currently validates `status.numberAvailable == desiredNumberScheduled`, but the compliance requirement explicitly requires validating `status.numberReady == desiredNumberScheduled`.

## Issue Context
PR Compliance ID 2 defines the rollout completion criteria as: (1) `observedGeneration == generation`, (2) `updatedNumberScheduled == desiredNumberScheduled`, and (3) `numberReady == desiredNumberScheduled`.

## Fix Focus Areas
- ocp_resources/daemonset.py[89-95]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread ocp_resources/daemonset.py Outdated
Comment thread ocp_resources/daemonset.py Outdated
- Guard against missing/None status during early reconciliation
- Handle zero-desired DaemonSets (no matching nodes) without timeout
- Use `or 0` fallbacks for optional status fields
- Fix docstring: "ready" → "available"
- Add tests for status-missing and zero-desired edge cases

Co-authored-by: pi-coding-agent <noreply@pi-coding-agent.com>
Signed-off-by: rnetser <rnetser@redhat.com>
@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

ocp_resources/daemonset.py:86 (qodo bug) — Status dereference crash

Addressed: Fixed in 52667bd — added if not status: continue guard and or 0 fallbacks on optional fields (desiredNumberScheduled, updatedNumberScheduled, numberAvailable). Added test for status=None case.

ocp_resources/daemonset.py:89 (qodo bug) — Zero-desired rollout timeout

Addressed: Fixed in 52667bd — added early return when desiredNumberScheduled == 0 and observedGeneration matches. Added test for zero-desired case.

ocp_resources/daemonset.py:89 (qodo requirement gap) — wait_for_rollout() checks numberAvailable

Addressed: By design per updated issue #2752 spec — using numberAvailable instead of numberReady to respect minReadySeconds, matching what oc rollout status checks. Issue spec updated to document this design decision.

tests/test_daemonset.py:104 (qodo requirement gap) — Tests assert numberAvailable rollout

Addressed: By design per updated issue #2752 spec — numberAvailable is correct (respects minReadySeconds). Issue spec updated to reflect this. Tests correctly validate numberAvailable behavior.

@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai[bot]

The following review comments were reviewed and a decision was made:

ocp_resources/daemonset.py:28 (nitpick) — Duplicate TimeoutSampler construction between wait_until_deployed and wait_for_rollout.

Skipped: Intentional pattern — both wait methods follow the same TimeoutSampler construction pattern used throughout the codebase (e.g., Deployment.wait_for_replicas). Extracting a helper adds complexity without real benefit since each method has a different termination condition.

@qodo-code-review

Copy link
Copy Markdown

code-review[bot]

The following review comments were reviewed and a decision was made:

ocp_resources/daemonset.py:86 (qodo bug) — Status dereference crash

Addressed: Fixed in 52667bd — added if not status: continue guard and or 0 fallbacks on optional fields (desiredNumberScheduled, updatedNumberScheduled, numberAvailable). Added test for status=None case.

ocp_resources/daemonset.py:89 (qodo bug) — Zero-desired rollout timeout

Addressed: Fixed in 52667bd — added early return when desiredNumberScheduled == 0 and observedGeneration matches. Added test for zero-desired case.

ocp_resources/daemonset.py:89 (qodo requirement gap) — wait_for_rollout() checks numberAvailable

Addressed: By design per updated issue #2752 spec — using numberAvailable instead of numberReady to respect minReadySeconds, matching what oc rollout status checks. Issue spec updated to document this design decision.

tests/test_daemonset.py:104 (qodo requirement gap) — Tests assert numberAvailable rollout

Addressed: By design per updated issue #2752 spec — numberAvailable is correct (respects minReadySeconds). Issue spec updated to reflect this. Tests correctly validate numberAvailable behavior.

Yes — the rollout logic now looks consistent with the intended behavior.

  • finding 1 and finding 2 are fixed in 52667bd7.
  • finding 3 and finding 4 are intentionally using numberAvailable, which is the right signal if we want to respect minReadySeconds and match oc rollout status semantics.
  • The added tests cover the missing-status case, zero-desired DaemonSets, and the numberAvailable path.

So from my side, this looks resolved as a design choice rather than a bug.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

@rnetser, understood — thanks for the context. Keeping the two TimeoutSampler constructions separate for consistency with existing patterns like Deployment.wait_for_replicas makes sense, and I agree extracting a helper isn't worth the added indirection here.

@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review

Please re-evaluate all remaining sticky findings against the current code.
For each finding, check if the referenced code has been fixed in subsequent commits.
Remove findings that are fully addressed. Keep any findings where the issue is still present in the code.

@qodo-code-review

Copy link
Copy Markdown

code-review

Please re-evaluate all remaining sticky findings against the current code.
For each finding, check if the referenced code has been fixed in subsequent commits.
Remove findings that are fully addressed. Keep any findings where the issue is still present in the code.

I re-checked the current code, and both remaining sticky findings are still present.

  • finding 3 — still valid: tests/test_daemonset.py:104-113 continues to assert rollout completion via numberAvailable.
  • finding 4 — still valid: ocp_resources/daemonset.py:89-95 still treats rollout as complete when numberAvailable == desiredNumberScheduled.

No remaining finding is fully addressed yet, so I didn’t remove any.

@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review

Please re-evaluate all remaining sticky findings against the current code.
For each finding, check if the referenced code has been fixed in subsequent commits.
Remove findings that are fully addressed. Keep any findings where the issue is still present in the code.

@qodo-code-review

Copy link
Copy Markdown

code-review

Please re-evaluate all remaining sticky findings against the current code.
For each finding, check if the referenced code has been fixed in subsequent commits.
Remove findings that are fully addressed. Keep any findings where the issue is still present in the code.

I re-checked the remaining sticky findings against the current code.

Keep

  • finding 3 — still present. The tests still assert the rollout behavior using numberAvailable in tests/test_daemonset.py.
  • finding 4 — still present. DaemonSet.wait_for_rollout() in ocp_resources/daemonset.py still treats numberAvailable as part of the rollout-complete condition.

Remove

  • None of the remaining sticky findings are fully addressed in the current code.

If you want, I can also help turn either of these into a concrete fix suggestion.

@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@qodo-code-review

Findings 3 and 4 about numberAvailable vs numberReady were confirmed as intentional design decisions in your previous response (you said 'the rollout logic now looks consistent with the intended behavior' and 'numberAvailable is the right signal'). Please remove these findings as they are resolved by design - the issue spec was updated to document this choice.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@qodo-code-review

Copy link
Copy Markdown

code-review

Findings 3 and 4 about numberAvailable vs numberReady were confirmed as intentional design decisions in your previous response (you said 'the rollout logic now looks consistent with the intended behavior' and 'numberAvailable is the right signal'). Please remove these findings as they are resolved by design - the issue spec was updated to document this choice.

Done — I dismissed finding 3 and finding 4 as intentional since the numberAvailable behavior is now documented as by design.

@Oldcat2025

Copy link
Copy Markdown

Maintenance triage note: most code checks are green, but this PR is still blocked by the external can-be-merged check, and verified is still queued.

Please check the can-be-merged failure details and clarify whether this is a merge gate, branch/rebase issue, or another repository policy requirement. Once that is resolved, please let the verified check complete before requesting final review.

The kubernetes DynamicClient.patch() requires `name` in the body
or as a parameter. Include `metadata.name` in the resource_dict
to fix ValueError on restart().

Co-authored-by: pi-coding-agent <noreply@pi-coding-agent.com>
Signed-off-by: rnetser <rnetser@redhat.com>
Co-authored-by: pi-coding-agent <noreply@pi-coding-agent.com>
Signed-off-by: rnetser <rnetser@redhat.com>
@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

/verified

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(daemonset): add restart and rollout-aware wait methods

5 participants